ABSTRACT

Four large-area Sunyaev-Zeldovich (SZ) experiments - APEX-SZ, SPT, ACT, and Planck - promise 
to detect clusters of galaxies through the distortion of Cosmic Microwave Background photons by hot
(> 10^6 K) cluster gas (the SZ effect) over thousands of square degrees. A large observational follow-up
effort to obtain redshifts for these SZ-detected clusters is under way. Given the large area covered by 
these surveys, most of the redshifts will be obtained via the photometric redshift (photo-z) technique. 
Here we demonstrate, in an application using ~3000 SDSS stripe 82 galaxies with r < 20, how the
addition of GALEX photometry (Fuv, Nuv) greatly improves the photometric redshifts of galaxies
obtained with optical griz or ugriz photometry. In the case where large spectroscopic training sets are
available, empirical neural-network-based techniques (e.g., ANNz) can yield a photo-z scatter of σ_z =
0.018(1 + z). If large spectroscopic training sets are not available, the addition of GALEX data makes
possible the use of simple maximum likelihood techniques, without resorting to Bayesian priors, and
obtains σ_z = 0.04(1 + z), an accuracy that approaches that obtained using spectroscopic training
of neural networks on ugriz observations. This improvement is especially notable for blue galaxies.
To achieve these results, we have developed a new set of high resolution spectral templates based on 
physical information about the star formation history of galaxies. We envision these templates to 
be useful for the next generation of photo-z applications. We make our spectral templates and new 
photo-z catalogs available to the community at www.ice.csic.es/personal/jimenez/PHOTOZ. 
Subject headings: galaxies, clusters, photometric redshifts, SZ, dark energy, general 



1. INTRODUCTION 

Thousands of square degrees of the sky are currently being observed at mm wavelengths by
three experiments: APEX-SZ 6, South Pole Telescope (SPT) 7, and Atacama Cosmology
Telescope (ACT) 8. In addition, the Planck satellite, to be launched this fall, will observe
the whole sky. The promise of these surveys is to provide a nearly mass-selected galaxy
cluster sample via the Sunyaev-Zeldovich (SZ) effect (Sunyaev & Zeldovich 1972). Because
the SZ effect is insensitive to redshift, clusters or groups of galaxies detected this way
need follow-up observations at other wavelengths to determine their redshifts. The large
area of sky covered and the large number of expected detections make spectroscopic
follow-up of galaxies in every cluster prohibitive. Upcoming surveys will rely on redshifts
obtained from broad-band photometry (photometric redshifts or photo-z) or custom-designed
narrow-band photometry (Moles et al. 2005; Benítez et al. 2008). As broad-band photometry
provides low resolution spectral information, the determination of galaxy redshifts can be
affected by relatively large errors. Photo-z errors can limit the accuracy of cosmological
studies using galaxies or clusters, which highlights the importance of improving photo-z
determinations. For example, SPT and ACT will attempt to constrain the dark energy equation
of state using SZ selected clusters and photo-z (Carlstrom et al. 2002). Lima & Hu (2007)
calculate that a photo-z bias of 0.003 and scatter of 0.03 will cause a ~10% increase in the
amplitude of the equation of state error bars achieved by SPT using this approach. The above
surveys will require photo-z not only for SZ clusters but also for field galaxies, to carry
out ancillary science such as exploiting the signal of CMB weak lensing (e.g., Carbone et al.
2007) by large scale structure and the kinetic-SZ effect (e.g., Hernández-Monteagudo et al.
2006): two powerful probes of the growth of structure, which are useful, for example, in
distinguishing between modified gravity and dark energy as the source of the present
accelerating expansion of the universe.

The use of broad-band photometry to determine redshifts is not new (see first attempts by
Baum 1962 and Koo 1985). In its minimalistic approach it consists of simply finding the best
fit redshift using a series of galaxy templates, which can be chosen either from stellar
population models or empirically (Koo 1999), as long as the set is exhaustive (i.e., fully
describes the galaxy population). With the arrival of large spectrographs, it became clear
that a refinement of the above technique could be achieved by using small subsets of
spectroscopic redshifts as "training sets" for larger photometric samples. One can then use
these training sets as inputs for empirical fits to the magnitudes versus z (e.g., Budavári
et al. 2005) or for artificial neural network codes to compute photo-z (Vanzella et al. 2004;
Collister & Lahav 2004; Oyaizu et al. 2007). Another approach is to use prior information
about galaxies, like the fact that faint galaxies tend to be farther away, as a Bayesian
prior for computing the redshift likelihood from the templates (Benítez 2000; Ilbert et al.
2006; Feldmann et al. 2006). Other recently developed techniques that go beyond simple
photometry fits include using structural properties of galaxies like their size or surface
brightness to obtain more accurate photo-z (e.g., Wray & Gunn 2007).

The above methods have their pros and cons. For example, methods based on training sets,
because of their empirical basis, can only be reliably extended as far as the spectroscopic
redshift limit. Training sets for surveys such as the Dark Energy Survey (DES) and the Large
Synoptic Survey Telescope (LSST) survey will need of the order of hundreds of thousands of
spectroscopic redshifts (Connolly et al. 1997; Oyaizu et al. 2006).

To use Bayesian prior-based methods, one needs to 
construct and test different priors for different redshift 
ranges and surveys, which also requires spectroscopic 
redshifts to accurately generate the prior distributions. 

Given the need to obtain relatively accurate photo-z 
for the large SZ survey areas we have explored an alter- 
native approach. The goal of this approach is to optimize 
photo-z accuracy while minimizing external assumptions 
(priors) and additional data acquisition. 

Our approach, presented in detail below, consists of obtaining moderate depth observations
with the Galaxy Evolution Explorer (GALEX) combined with optical griz data. This data
combination was first tried by Budavári et al. (2005), Way & Srivastava (2006), and Ball et
al. (2007), who used empirical approaches with spectroscopic training sets for photo-z
determination. Adding the two GALEX broad bands at central wavelengths of ~1500 Å (Fuv) and
~2300 Å (Nuv) to optical griz photometry improves photo-z determinations, while requiring
minimal assumptions about external priors, for the following reason. The 4000 Å break, which
is the most commonly used spectral feature for optical photo-z determination, is greatly
reduced for blue galaxies, making it more difficult to use as a redshift indicator. This
problem is particularly acute at z > 0.5, where most galaxies are young and have high star
formation rates (e.g., Heavens et al. 2003). The 912 Å Lyman limit, on the other hand, is
exhibited by all galaxies (Fig. 2). Filters that sample closer to the Lyman limit help to pin
down the galaxy type and redshift, especially for blue galaxies with no substantial 4000 Å
break. Further, given the strong sensitivity of the UV to star formation, one can directly
obtain a measure of star formation.
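The argument above rests on simple redshift arithmetic: a rest-frame feature at wavelength λ is observed at λ(1 + z). The following is a minimal sketch of that arithmetic, using only the approximate GALEX central wavelengths quoted in the text; the function names are illustrative and not part of any photo-z code.

```python
# Rest-frame wavelengths (angstroms) of the two spectral features named in the text.
LYMAN_LIMIT_A = 912.0
BREAK_4000_A = 4000.0

# Approximate GALEX central wavelengths (angstroms) as quoted in the text.
GALEX_CENTERS_A = {"Fuv": 1500.0, "Nuv": 2300.0}


def observed_wavelength(rest_A, z):
    """Wavelength at which a rest-frame feature is observed for redshift z."""
    return rest_A * (1.0 + z)


def redshift_at_band(rest_A, band_center_A):
    """Redshift at which a rest-frame feature reaches a band's central wavelength."""
    return band_center_A / rest_A - 1.0


if __name__ == "__main__":
    for name, center in GALEX_CENTERS_A.items():
        z = redshift_at_band(LYMAN_LIMIT_A, center)
        print(f"Lyman limit reaches the {name} center at z = {z:.2f}")
    shifted = observed_wavelength(BREAK_4000_A, 0.5)
    print(f"4000 A break is observed at {shifted:.0f} A for z = 0.5")
```

With these numbers the Lyman limit sits inside the GALEX bands over the low-redshift range studied here, while the 4000 Å break moves through the optical bands.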

In carrying out this work, we found that galaxy templates with well motivated blue spectra
(in particular, blue-wards of 3000 Å) are not publicly available. Either this region of the
spectrum was missing (as in the original Coleman, Wu, & Weedman 1980 templates) or it was
modeled roughly with no spectral features beyond the Lyman limit. Motivated by the need to
provide reliable empirical templates in this region of the spectrum and higher spectral
resolution than currently available models, we have developed our own templates. To do so we
have exploited our knowledge of the star formation history of galaxies over cosmic time
(Panter et al. 2007) to help us build physically motivated templates.

We present a test of the performance of this approach on spectroscopic samples from the Sloan
Digital Sky Survey (SDSS) stripe 82 region. We find that, in the case where large
spectroscopic training sets are available, empirical neural-network-based techniques (e.g.,
ANNz; Collister & Lahav 2004; Oyaizu et al. 2007) give σ_z = 0.018(1 + z) for optical
photometry combined with GALEX observations. If large spectroscopic training sets are not
available, the addition of GALEX data makes possible the use of simple maximum likelihood
techniques, without resorting to Bayesian priors, and obtains σ_z = 0.04(1 + z), which
approaches the accuracy obtained using spectroscopic training of neural networks on ugriz
observations. In particular, we show how the large number of catastrophic failures that occur
for griz-based and ugriz-based maximum likelihood photometric redshift determinations is
nearly eliminated by adding UV photometry from GALEX data. The improvement is especially
notable for blue galaxies with g - r < 0.6, for which a photo-z scatter of 0.03(1 + z) is
achieved on galaxies with r < 19 and z < 0.25. As noted below and by Ilbert et al. (2006),
the absence of the u band significantly degrades the performance of the photo-z estimation.
We show that the addition of GALEX UV observations is preferable to the addition of optical
u band observations.

The rest of the paper is organized as follows: in §2 we 
describe our source sample. In §3 we present our method 
and details of the implementation. In §4 we discuss our 
results. Conclusions are presented in §5. 

2. SOURCE SAMPLE 

Our GALEX observations comprise a Legacy program awarded in cycle 3, with the goal of mapping
~100 deg^2 with 3 ks exposure time per pointing in both the Fuv and Nuv filters. We chose to
map roughly 11 deg^2 covering the Blanco Cosmology Survey 9 (BCS) 23-hour field at
declination -55° and a larger area of the equatorial stripe 82, which is covered by SDSS.
Both areas have griz observations, and SDSS also has u observations. The BCS field is part of
the common SZ area survey; however, as there is currently no significant sample of
spectroscopic redshifts in the BCS region to validate our photo-z, BCS analysis will not be
presented here. The SDSS stripe 82 has been observed by ACT and (of course) offers a sample
of SDSS spectroscopic redshifts to test the photo-z performance. We took advantage of the
fact that the stripe 82 survey area includes a number of the GALEX Medium Imaging Survey
(MIS) fields, which already had many > 1.5 ks observations and therefore needed only partial
additional observations to reach our 3 ks target. In total we will collect ~210 ks of
integration time - merely 2.4 days of observations.

At the time this analysis was completed, only about half of the planned observations had been
made. The stripe 82 data set used for this analysis comprises 56 GALEX fields (~55 deg^2 of
coverage, although some field edges lie outside the SDSS stripe 82 region) to Nuv depths
between 2 ks and 6.5 ks (Fig. 1). These depths allow us to probe deeper magnitudes and a more
complete sample than has been possible with previous photo-z studies that used GALEX data
(Budavári et al. 2005; Way & Srivastava 2006; Ball et al. 2007). Of those fields, 41 are
publicly available MIS data, and the other 15 are from our guest investigator proposal. 10

9 cosmology.uiuc.edu/BCS

Fig. 1. - Distribution of exposure times on the GALEX fields used in this analysis for Nuv
(black) and Fuv (red) observations. Only fields with Nuv exposure > 2 ks were used.

2.1. Magnitudes 

Accurate photometry is critical for obtaining accurate photo-z. Because of the differences in
the point spread functions (PSF) of different instruments and between bands, simple aperture
photometry is not appropriate for this study. The SDSS PSF widths are approximately 1.5" and
vary with sky brightness (Abazajian et al. 2003), while GALEX PSF widths vary across the
field between roughly 4" and 7" (Morrissey et al. 2005). Our approach is to use AB magnitude
measures that are as close as possible to the total flux emitted by the galaxy in each band.

As part of the standard GALEX pipeline for each field, SExtractor is run on both the Fuv and
Nuv images to extract multi-pixel sources that are detected above the noise threshold in
background-subtracted images (Bertin & Arnouts 1996). We use the Nuv and Fuv mag_auto outputs
of SExtractor, which optimizes elliptical apertures for each source to integrate the total
flux. The Fuv bandwidth and transmission are both roughly a factor of two smaller than the
Nuv (Fig. 2), causing it to have substantially lower sensitivity. Because of this, far fewer
sources are independently detected in the Fuv band.

For the SDSS data we explored the use of both C-model and model magnitude measurements. These
magnitudes are obtained by fitting two models to the galaxy profile: an exponential disc and
a de Vaucouleurs profile. The fits are integrated to three and seven times the characteristic
radius, respectively, at which point the function is truncated to smoothly go to zero within
one additional characteristic radius. For the model magnitudes all bands are measured using
the best fit model to the r-band data, while for the C-model magnitudes, the two fits are
weighted based on the quality of the fit and combined to obtain the best fitting profile for
each filter band. 12 The C-model measurement provides the best estimate of the total
photometric flux for each SDSS band. 13 While testing the template-fitting photo-z techniques
(§3.2) we found that model magnitudes provide a better relative calibration when comparing
only SDSS bands (especially after adding the Padmanabhan et al. (2007) "ubercalibration"
corrections); however, the C-model magnitudes provide a better absolute calibration for
comparing with other instruments, such as GALEX. Both model and C-model magnitudes were also
tested using the ANNz analysis described in §3.3, and no significant differences were found
between the results using the different magnitudes. The ANNz results presented in §4.2 were
calculated using model magnitudes.

10 Niemack (2008) describes the source sample in more detail.

11 galex.stsci.edu/GR2/?page=ddfaq#2

12 www.sdss.org/dr5/algorithms/photometry.html

13 While it may not be the case for low signal-to-noise cases, in the high signal-to-noise
regime this procedure yields the total photometric flux and is not affected by systematic
errors. For the objects considered the SDSS signal-to-noise is > 5.

Magnitude corrections of -0.04 and +0.02 are applied to the SDSS u and z bands, respectively,
to convert from SDSS magnitudes into AB magnitudes. 14 All reported magnitudes are in the AB
system. In §4 we assess the performance of our photo-z analysis on those SDSS galaxies with
spectroscopic redshifts with confidence > 0.9. SDSS objects are excluded from the catalog
using the "blended," "nodeblend," and "saturated" flags. The majority of the SDSS
spectroscopic measurements have r < 18, although there are also a substantial number of
spectroscopic measurements with 18 < r < 20 (which are primarily Luminous Red Galaxies), so
we have limited our current analysis to the r < 20 magnitude regime (except as discussed in
Fig. 8 and §4.3).
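The corrections and cuts described in this paragraph can be sketched as follows. The dictionary layout of a source (keys "mags", "z_conf", "flags") is hypothetical and chosen only for illustration; the zero offsets for g, r, and i are an assumption that follows from the text naming corrections for u and z only.

```python
# Offsets (mag) added to SDSS magnitudes to obtain AB magnitudes.
# u and z values are from the text; zero for g, r, i is assumed.
SDSS_TO_AB_OFFSET = {"u": -0.04, "g": 0.0, "r": 0.0, "i": 0.0, "z": 0.02}

# SDSS flags used in the text to exclude objects from the catalog.
EXCLUDE_FLAGS = {"blended", "nodeblend", "saturated"}


def sdss_to_ab(mags):
    """Return a new dict of AB magnitudes from a dict of SDSS magnitudes."""
    return {band: m + SDSS_TO_AB_OFFSET[band] for band, m in mags.items()}


def keep_for_validation(source, r_limit=20.0, min_conf=0.9):
    """Apply the flag, redshift-confidence, and r-magnitude cuts.

    `source` is a hypothetical record: {"mags": {...}, "z_conf": float,
    "flags": iterable of flag names}.
    """
    if EXCLUDE_FLAGS & set(source["flags"]):
        return False
    if source["z_conf"] <= min_conf:
        return False
    return source["mags"]["r"] < r_limit


if __name__ == "__main__":
    print(sdss_to_ab({"u": 20.00, "r": 18.50, "z": 17.90}))
```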

We emphasize that by using total magnitudes for each 
band we minimize the potential problem of missing light 
because of choosing an aperture in one band that does 
not encompass all the light in other bands. Measur- 
ing the total light is important when using the GALEX 
bands both because of the different PSF sizes and be- 
cause most of the star formation in galaxies takes place 
at the galaxy perimeter; thus, a fixed aperture based 
on a single optical band can exclude much of the light 
from recent star formation, which is measured by the UV 
bands. 

2.2. Catalog matching 

The GALEX and optical catalogs are merged as follows: we initially assign optical sources to
a GALEX field pointing if they fall within 35.1' of the GALEX field center. This cuts the
noisiest region of the GALEX fields (near the edges), while maintaining complete sky coverage
between neighboring fields (i.e., leaving no gaps). Within every GALEX field, each optical
source is matched to the nearest GALEX object with a Nuv detection within a 4" radius, which
is a relatively conservative matching radius (Agüeros et al. 2005). After all sources in the
field are assigned, the combined catalog is searched to test whether any two optical sources
are assigned to the same GALEX object. When there are overlapping assignments, the closest
source to the GALEX position is selected and the other is removed from the catalog. 15
Sources that do not have a GALEX detection or overlapping assignments are kept in the catalog
for spectroscopic confirmation tests. We characterize the distributions of GALEX Fuv and Nuv
magnitudes in each field using histograms with 0.1 magnitude bins. The magnitude limit used
for other sources in the same field during photo-z analysis is set to be the highest
magnitude where the number of galaxies exceeds half of the number at the peak magnitude bin
(Niemack 2008). Sources with magnitudes higher than this limit (as well as objects with no
Nuv detection) are labeled as non-detections, and this magnitude limit is used for the
non-detections in the photo-z calculation (§3).

14 www.sdss.org/dr6/algorithms/fluxcal.html#sdss2ab

15 Removing sources with overlapping assignments was also explored and had negligible impact
on the results in this paper since so few sources had overlapping assignments.

In 56 GALEX fields in stripe 82, ~3000 SDSS sources with spectroscopic redshifts were found
that meet the above criteria. Of these sources, 75% were found to have Nuv detections within
the 4" matching radius, and only two pairs of them were matched to the same GALEX source.
Only seven of the sources had r > 20, which we treat as the magnitude limit of the
spectroscopic analysis. When generating the photo-z catalogs, we also use SDSS objects
without spectroscopic data. In the same 56 GALEX fields almost 150,000 SDSS sources were
found with magnitude r < 21, and 55% of those were successfully matched to GALEX sources.
Less than 1% of those were excluded because they were matched to the same GALEX source as
another optical source. Both catalogs were also searched for SDSS sources with multiple GALEX
sources within a 4" radius, and none were found in the spectroscopic catalog, while nine were
found in the photometric sample and were removed from the catalog.
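The matching procedure of §2.2 (nearest GALEX object within 4", then resolving cases where two optical sources claim the same GALEX object) can be sketched as below. This is a brute-force illustration with hypothetical names, assuming positions are given as (RA, Dec) in degrees; a real catalog of this size would use a spatial index rather than the O(N x M) loop shown.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine formula, inputs in degrees)."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2.0) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(a))) * 3600.0


def match_catalogs(optical, galex, radius_arcsec=4.0):
    """Match each optical source to its nearest GALEX source within the radius.

    Returns (matches, removed): `matches` maps optical index -> GALEX index;
    `removed` lists optical indices dropped because a closer optical source
    was assigned to the same GALEX object.
    """
    claimed = {}  # GALEX index -> (separation, optical index)
    removed = []
    for i, (ra, dec) in enumerate(optical):
        in_radius = []
        for j, (gra, gdec) in enumerate(galex):
            sep = angular_sep_arcsec(ra, dec, gra, gdec)
            if sep <= radius_arcsec:
                in_radius.append((sep, j))
        if not in_radius:
            continue  # no GALEX counterpart: kept elsewhere as a non-detection
        sep, j = min(in_radius)
        if j in claimed and sep >= claimed[j][0]:
            removed.append(i)  # an earlier optical source is closer
            continue
        if j in claimed:
            removed.append(claimed[j][1])  # this source is closer; drop the earlier one
        claimed[j] = (sep, i)
    matches = {i: j for j, (_, i) in claimed.items()}
    return matches, sorted(removed)


if __name__ == "__main__":
    optical = [(10.0, 0.0), (10.0, 0.0005), (20.0, 0.0)]
    galex = [(10.0, 0.0001), (30.0, 0.0)]
    print(match_catalogs(optical, galex))
```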

3. METHODS 

After adding the GALEX bands, we consider two 
different approaches for computing the photo-z. First 
we describe a spectral energy distribution (SED, or 
template-based) photo-z calculation technique and the 
new SED templates that we have developed. This ap- 
proach assumes no prior knowledge of the redshift distri- 
bution and does not require spectroscopic measurements. 
Our second approach is to analyze the same GALEX 
plus optical catalog using ANNz techniques to train the 
photo-z calculation with the SDSS spectroscopic mea- 
surements. 

To quantify the accuracy of different photo-z analyses, we define the redshift error as

dz = (z_ph - z_sp) / (1 + z_sp),

where z_ph is the photo-z and z_sp is the spectroscopic z. The mean, z_bias, and standard
deviation, σ_z, of dz (i.e., the photo-z bias and scatter) are calculated for all galaxies
with z_ph < 1, which is motivated by the fact that given the optical and UV depths, we do not
expect to detect galaxies near or above z = 1. In the SDSS results presented, these z > 1
failures amount to less than 1% of the galaxies in the spectroscopic catalog and ~1% of the
GALEX detected galaxies with r < 20 in the photometric catalog. A final cut is made on
objects with Nuv - g > 1 as this color is typical of QSOs rather than galaxies. This cut also
removes less than 1% of the complete SDSS spectroscopic catalog and ~3% of the GALEX detected
galaxies with r < 20 in the photometric catalog.
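A minimal sketch of the bias and scatter statistics follows. It assumes the redshift error is normalized by (1 + z_sp), which is consistent with scatters being quoted in the form σ_z = 0.04(1 + z), and it uses the population standard deviation; both choices are assumptions made for illustration, as is the function name.

```python
import statistics


def photoz_bias_and_scatter(z_ph, z_sp, z_max=1.0):
    """Mean (bias) and standard deviation (scatter) of the redshift error dz.

    Assumes dz = (z_ph - z_sp) / (1 + z_sp). Only sources with z_ph < z_max
    enter the statistics; no other outlier rejection is applied.
    """
    dz = [(p - s) / (1.0 + s) for p, s in zip(z_ph, z_sp) if p < z_max]
    return statistics.fmean(dz), statistics.pstdev(dz)


if __name__ == "__main__":
    z_phot = [0.11, 0.19, 1.20]  # the last entry is a z_ph > 1 failure and is cut
    z_spec = [0.10, 0.20, 0.30]
    bias, scatter = photoz_bias_and_scatter(z_phot, z_spec)
    print(f"bias = {bias:.4f}, scatter = {scatter:.4f}")
```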

The analysis is done on different combinations of the seven optical (SDSS) and UV bands. This
allows us to study the impact of including different bands on photo-z accuracy and thus
estimate the importance of different bands for future observations. Our photo-z are then
compared to the recently published results of the SDSS ANNz photo-z pipeline (henceforth
ANNz; Oyaizu et al. 2007), which was developed using a spectroscopic training and validation
set comprised of ~640,000 galaxies (Fig. 3).

We note that when reporting standard deviations to 
study the performance of the photo-z we have not ex- 
cluded outliers (with the exception of cutting the small 
number of galaxies with z > 1). Excluding or down- 
weighting outliers is common practice in the photo-z lit- 
erature, motivated by the fact that the photo-z error 
distribution often is a Gaussian around the peak but has 
long tails. As we quote standard deviations with the 
outliers included, caution is needed when comparing our 
numbers with those in the literature. In particular, for the maximum likelihood analysis
presented in Fig. 3 the standard down-weighting of the outliers would reduce the ugriz
photo-z scatter by 20% and the GALEX + griz photo-z scatter by 15%, while having a much
smaller effect on the SDSS ANNz performance.

3.1. New spectral templates for photo-z 

For template-based photo-z calculation, we need a basis of spectral templates that represents
galaxies in the redshift range of interest and for the magnitude range of the catalog. The
approach in the literature so far has been to either use empirical templates (Coleman, Wu, &
Weedman 1980; Kinney et al. 1996) or use synthetic models with simple recipes to model the
star formation law in galaxies, most typically using a declining exponential.

Recent advances in both observations and stellar modeling have allowed different groups to
determine the complete star formation history of galaxies (Heavens et al. 2004; Fernandes et
al. 2005; Panter et al. 2007) for a wide range of galaxy stellar masses (10^7 - 10^12 M_⊙).
Panter et al. (2007) found that stellar mass is the parameter that most directly determines
the galaxy's star formation history and SED. Taking advantage of this finding, we use six
mass ranges with their corresponding reconstructed star formation histories to obtain six
spectral templates. These templates should encompass the entire galaxy population and are
therefore a representative basis of galaxies in the universe. The spectral templates are
built using the input star formation history with solar metallicity (changing the template
metallicity has little impact on the final photo-z performance) using the Bruzual & Charlot
(2003) models. The models have only absorption lines, so for the star-bursting galaxy
template (SED5 in Fig. 2) we use the emission lines from the Kinney et al. (1996) models and
add them to this template only. The new templates - shown in Fig. 2 - provide a higher
resolution and wider spectral range than other publicly available templates. Note that we
have not adjusted the templates to obtain the best photometric redshifts, but rather we have
used the physical knowledge of the recovered star formation history of the universe as our
input. We evaluate the performance of these new templates below.

3.2. Template fitting photo-z calculation 

With the addition of the two GALEX bands, our template-based methodology to obtain photo-z is
fairly simple. We use the six galaxy templates in Fig. 2 and perform a maximum likelihood
(ML) analysis, which is simply a chi-square minimization, to find the best fitting model to
the observed photometry. No priors are used for this analysis, or more accurately, we assume
flat redshift and template priors. We use two codes to compute the photo-z: BPZ 16 (Benítez
2000), which has the ability to simultaneously calculate photo-z using both ML analysis and a
Bayesian prior for comparison, and ZEBRA 17 (Feldmann et al. 2006), which is a recently
released independent code that uses similar techniques to BPZ. Most of our analysis will be
done using BPZ because when including the UV data it performs significantly better than ZEBRA
on our data set in the redshift range 0.25 < z < 0.4, although we note that slightly better
results can be obtained by ZEBRA at z < 0.25. 18

16 Code version bpz. 1.98b; acs.pha.jhu.edu/~txitxo/bpzdoc.html

17 www.exp-astro.phys.ethz.ch/ZEBRA/

18 The same differences between BPZ and ZEBRA were observed with a variety of galaxy
templates when using the GALEX data; however, when the optical data is analyzed without GALEX
data, BPZ and ZEBRA provide nearly identical results.

Fig. 2. - The top panel shows the six galaxy templates that are used to find the maximum
likelihood solution for the photometric redshifts (§3.2). The vertical dashed lines show the
central frequencies of the GALEX and SDSS bandpasses. The middle panel shows the two GALEX
bands (Fuv, Nuv) as well as the five SDSS bands (u, g, r, i, z). The bottom panel shows the
templates redshifted to z = 1. As the different galaxy types are redshifted, a
redshift-brightness degeneracy arises in the optical bands (especially when only considering
griz bands) for the galaxies with blue spectra. The addition of the GALEX bands breaks this
degeneracy by sampling out to the 912 Å Lyman limit. Note that by z = 1 the Lyman limit has
shifted out of the Fuv band, but it does not reach the central frequency of the more
sensitive Nuv band until z ≈ 1.5.

The observed magnitudes are matched to the predicted spectral energy distributions through
each bandpass from the templates in Fig. 2. As suggested by Benítez (2000), two points of
interpolation are allowed between the different templates in color space, which allows the
best fit template to be a (2:1 or 1:2) mix of two neighboring templates. The photo-z
computation is set to have a precision of δz = 0.01. The only limit imposed in the ML
calculation is a sharp prior z < 1.5; further (as described above), we exclude from the
sample the small number of sources with photo-z > 1.
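The chi-square minimization described in this section can be illustrated with a toy example. The sketch below is not the BPZ or ZEBRA implementation: the templates are invented smooth functions with a Lyman-limit drop and a 4000 Å break, band fluxes are sampled at assumed central wavelengths instead of being integrated through bandpasses, and the 2:1 and 1:2 interpolation is done on fluxes rather than in color space. All names are hypothetical.

```python
import math

# Assumed approximate band centers in angstroms: Fuv, Nuv, g, r, i, z.
BAND_CENTERS_A = [1500.0, 2300.0, 4700.0, 6200.0, 7500.0, 8900.0]


def make_toy_template(slope):
    """Invented SED: power law with a smooth Lyman-limit drop and 4000 A break."""
    def sed(rest_A):
        lyman = 1.0 / (1.0 + math.exp(-(rest_A - 912.0) / 50.0))
        balmer = 1.0 + 0.5 * math.tanh((rest_A - 4000.0) / 300.0)
        return lyman * balmer * (rest_A / 4000.0) ** slope
    return sed


def add_interpolated(models):
    """Insert 2:1 and 1:2 mixes between each pair of neighboring templates."""
    out = []
    for a, b in zip(models, models[1:]):
        out.append(a)
        out.append([(2.0 * x + y) / 3.0 for x, y in zip(a, b)])
        out.append([(x + 2.0 * y) / 3.0 for x, y in zip(a, b)])
    out.append(models[-1])
    return out


def chi2_best_amplitude(flux, err, model):
    """Chi-square after fitting the free overall normalization analytically."""
    den = sum(m * m / e ** 2 for m, e in zip(model, err))
    if den == 0.0:
        return math.inf
    amp = sum(f * m / e ** 2 for f, m, e in zip(flux, model, err)) / den
    return sum(((f - amp * m) / e) ** 2 for f, m, e in zip(flux, model, err))


def fit_photoz(flux, err, templates, z_max=1.5, dz=0.01):
    """Grid search over z < z_max (flat priors) for the minimum chi-square."""
    best_chi2, best_z, best_t = math.inf, None, None
    for k in range(int(round(z_max / dz))):
        z = k * dz
        base = [[sed(lam / (1.0 + z)) for lam in BAND_CENTERS_A] for sed in templates]
        for t, model in enumerate(add_interpolated(base)):
            chi2 = chi2_best_amplitude(flux, err, model)
            if chi2 < best_chi2:
                best_chi2, best_z, best_t = chi2, z, t
    return best_z, best_t, best_chi2


if __name__ == "__main__":
    templates = [make_toy_template(s) for s in (2.0, 1.0, 0.0, -1.0)]
    observed = [3.0 * templates[1](lam / 1.30) for lam in BAND_CENTERS_A]
    print(fit_photoz(observed, [0.1] * len(observed), templates))
```

In this toy setup the redshift information comes entirely from the two spectral features, since a pure power law redshifts into itself up to a normalization; this is the same degeneracy that the UV bands are meant to break for blue galaxies.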

3.3. Artificial neural-network photo-z calculation 

We also consider the empirical photo-z method ANNz developed by Collister & Lahav (2004). We
compare the performance of our template-based photo-z method to the results of Oyaizu et al.
(2007), who trained and validated their artificial neural network on 640,000 galaxies with
ugriz SDSS photometry and provide a photo-z catalog for SDSS galaxies with r < 22. We use
their photo-z determinations for the galaxies in our sample as a benchmark to compare the
performance of our template-based technique (§4 and Figs. 3, 4, 5, and 6). ANNz in this case
yields a photo-z scatter of σ_z = 0.027(1 + z).

To explore the photo-z potential of GALEX observations in more detail, we also use the
publicly available ANNz code 19 with our combined GALEX and SDSS catalog to obtain more
accurate photo-z (henceforth ANNzG). We use as a training set the SDSS galaxies in our GALEX
fields that also have spectra. Of the ~3000 objects with SDSS spectroscopic redshifts in our
catalog, 700 are used as our training set and 400 as our validation set. We then re-run ANNz
on the full ~3000 objects to estimate its performance (§4.2). Two different network
architectures were explored for the ANNzG analysis: one with five hidden layers with 10 nodes
each and three committee members, and a simpler version with two hidden layers with 10 nodes
each and no committee. The more complex architecture did result in a slight reduction of the
photo-z scatter; however, the relative results of using different data combinations were
nearly identical. We present the results of the simpler network analysis here. Note that all
galaxies in the specified magnitude range are included in the ANNz and ANNzG analysis, since
(unlike the template-based analysis) there are no galaxies with photo-z > 1, and because of
the nature of the ANNz calculation, it can also simultaneously calculate the photo-z for the
bluest objects, such as QSOs.

4. RESULTS 

4.1. Photo-z analysis with no priors 

The addition of GALEX data to the optical measurements alleviates the redshift-brightness
degeneracy and greatly improves the photo-z estimation. In Fig. 3, the upper-left panel shows
the ML recovered photo-z when using only griz data. As expected, the number of catastrophic
failures is high, resulting in a large standard deviation of σ_z = 0.17(1 + z). Addition of
the u band data (upper-right panel) reduces the number of catastrophic failures and halves
the standard deviation to σ_z = 0.08(1 + z). 20 Including the GALEX data (middle panels)
reduces the standard deviation to σ_z = 0.04(1 + z) and removes nearly all catastrophic
failures. Note that the addition of u data has negligible effect when the GALEX bands are
added. The bottom-left panel shows the comparison with ANNz. The standard deviation of ANNz
is ~30% smaller than GALEX + griz, although we note that ANNz is being compared to a sample
that includes its own training and validation set, which makes the comparison a bit unfair.
We find that simply adding the GALEX moderate exposures to griz imaging and using ML analysis
techniques with six empirically motivated galaxy templates provides photo-z approaching the
accuracy of ANNz on ugriz.

19 zuserver2.star.ucl.ac.uk/~lahav/annz.html

20 We note that the standard deviation of the ugriz analysis is reduced by 25% to σ_z =
0.06(1 + z) if SDSS model magnitudes are used. These are the best internally calibrated
magnitudes for SDSS

Fig. 4. - Errors in photo-z estimation versus spectroscopic redshift. The mean (top) and
standard deviation (middle) of dz are shown as a function of redshift. We compare the ML
photo-z results using the SDSS griz data (green), which is representative of the BCS
measurements, as well as the SDSS ugriz data (red), the GALEX + griz data (black), and the
SDSS ANNz results (blue). The GALEX + griz data approaches the standard deviation of the ANNz
results without the use of priors or training sets. We also show the improvement that can be
achieved in the GALEX + griz analysis by using the ZEBRA code for z_ph < 0.25 and BPZ code
otherwise, which we call BPZebra (black dashed). In the bottom panel, the total number of
sources in each z bin is shown (black) as well as the total number of sources with a GALEX
detection (red).

We explore the performance of the photo-z in more detail
in Figs. [4], [5], and [6] to investigate the dependence on
redshift, magnitude, and color, respectively. In Fig. [4]
we show how the photo-z bias and scatter evolve as
a function of redshift. Adding the GALEX data dramatically
reduces the bias and scatter relative to the optical
bands alone at z < 0.3, beyond which the proportion
of galaxies with GALEX detections falls off at the current
GALEX observation depths (Fig. [4], bottom panel).
Still, the performance approaches the level of ANNz up



Fig. 5. — Errors in photo-z estimation versus source r magnitude. 
The mean (top) and standard deviation (middle) of dz are shown 
in different r bins. (Colors are the same as in Fig. [4] and [6].) The
GALEX + griz data approaches the standard deviation of the 
ANNz results without the use of priors or training sets. At the 
bottom, the total number of sources in each r bin is shown (black) 
as well as the total number of sources with a GALEX detection 
(red). 

to near z ≈ 0.4. In Fig. [5] we show the photo-z bias and
scatter as a function of the source r magnitude. Both
remain nearly flat in the regime r < 19, beyond which
the fraction of galaxies detected by GALEX falls to less
than 1/2. These plots clearly indicate that with deeper
GALEX exposures, we can expect to improve our results
for fainter objects and higher redshifts. In Fig. [6] the
photo-z performance as a function of galaxy color is examined.
The scatter is equivalent to (or possibly even
lower than) that of ANNz for g - r < 0.6 and is only slightly
larger up to g - r ≈ 2. When compared to the other ML
methods without GALEX photometry, the addition of
GALEX bands returns significantly more accurate photo-z
for colors as red as g - r = 1.4.
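
The binned bias and scatter described above can be sketched as follows. This is a minimal illustration, assuming that dz is the normalized residual (z_ph - z_spec)/(1 + z_spec); the function name, the bin edges, and the toy catalog are hypothetical.

```python
import math
from bisect import bisect_right

def binned_bias_scatter(x, dz, edges):
    """Mean and standard deviation of dz in bins of x (e.g. r magnitude).

    Bins are [edges[i], edges[i+1]); values outside the edges are ignored.
    Returns a list of (count, mean, std) per bin, with None for empty bins.
    """
    groups = [[] for _ in range(len(edges) - 1)]
    for xi, di in zip(x, dz):
        k = bisect_right(edges, xi) - 1
        if 0 <= k < len(groups):
            groups[k].append(di)
    stats = []
    for g in groups:
        if not g:
            stats.append((0, None, None))
            continue
        mean = sum(g) / len(g)
        std = math.sqrt(sum((d - mean) ** 2 for d in g) / len(g))
        stats.append((len(g), mean, std))
    return stats

# Toy catalog: r magnitudes and normalized redshift residuals (hypothetical).
r_mag = [17.2, 17.8, 18.1, 18.6, 19.3, 19.7]
dz = [0.01, -0.01, 0.02, -0.02, 0.05, -0.09]
edges = [17, 18, 19, 20]
for lo, (n, mean, std) in zip(edges, binned_bias_scatter(r_mag, dz, edges)):
    print(f"{lo} <= r < {lo + 1}: N={n} mean={mean:+.3f} std={std:.3f}")
```

The same function applies to bins of spectroscopic redshift or g - r color by passing a different x array and set of edges.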

We also consider removal of the excess of galaxies in
the lowest BPZ redshift bins (z < 0.03). Cutting these
galaxies results in a ~5% reduction of the standard deviation
and almost a factor of ten reduction in bias (Fig. [3],
middle panels). At z_ph < 0.25 the ZEBRA photo-z scatter
is ~8% smaller than that of BPZ, and ZEBRA does not show
the excess in the lowest z bin. A hybrid technique (ZEBRA at
z_ph < 0.25 and BPZ at z_ph > 0.25, which we call BPZebra)
takes advantage of the fact that ZEBRA does not have
a pile-up of galaxies at low z. It reduces the total
scatter by ~8% and reduces the total bias by an amount
similar to that obtained by cutting the low-z galaxies
(dashed line in Fig. [4]).
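
The hybrid selection can be sketched in a few lines. This is an illustration only: it assumes that the ZEBRA estimate itself decides which code is used for a given galaxy, which the text does not specify, and the function name and input values are hypothetical.

```python
def bpzebra(z_zebra, z_bpz, z_switch=0.25):
    """Combine two photo-z estimates per galaxy.

    Assumption: the ZEBRA estimate picks the branch. Galaxies with
    z_zebra < z_switch keep the ZEBRA value; all others take the BPZ value.
    """
    if len(z_zebra) != len(z_bpz):
        raise ValueError("input lists must have the same length")
    return [zz if zz < z_switch else zb for zz, zb in zip(z_zebra, z_bpz)]

z_zebra = [0.08, 0.24, 0.31]
z_bpz = [0.01, 0.22, 0.35]
print(bpzebra(z_zebra, z_bpz))  # [0.08, 0.24, 0.35]
```

In the first toy galaxy the BPZ value falls in the lowest redshift bin, and the hybrid estimate keeps the ZEBRA value instead.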
Fig. 6. — Errors in photo-z estimation versus color, g - r. The
mean (top) and standard deviation (middle) of dz are shown in
different g - r bins. (Colors are the same as in Fig. [4] and [5].) For
the low g - r bins, or blue galaxies, the standard deviation of the
GALEX + griz results is a large improvement over the SDSS-only
data and is essentially equivalent to the ANNz results without the
use of priors or training sets. At the bottom, the total number of
sources in each g - r bin is shown (black) as well as the total number
of sources with a GALEX detection (red).

4.2. ANNz analysis for improving stripe 82 photo-z 

The performance of ANNz using GALEX data
(ANNzG, §3.3) is explored with several combinations of
SDSS bands. 21 The best performance is (not surprisingly)
obtained with all five SDSS bands and GALEX
(ANNzG: ugriz); in this case the scatter is σ_z =
0.018(1 + z). Removing the SDSS u (ANNzG: griz) or
u and z (ANNzG: gri) bands causes only slight degradations
in the photo-z scatter, to σ_z ≈ 0.020(1 + z). As a
systematic test of our ANNzG approach, we run the same
analysis on the SDSS ugriz data in our catalog and find
that it gives σ_z = 0.026(1 + z), which is consistent with
the scatter of σ_z ≈ 0.027(1 + z) from the Oyaizu et al. (2007)
ANNz pipeline on this data set. These results, combined
with the results in §4.1, indicate that the GALEX
bands provide superior redshift information to the SDSS
u and/or z bands.

In Fig. [7] we explore the color, g - r, and magnitude, r,
dependence of the photo-z scatter from the ANNzG and
ANNz calculations. The addition of the GALEX data
results in a clear and significant reduction in scatter for
nearly all color and magnitude bins, with the exception
of the reddest (high g - r) and brightest (low r) galaxies.
The general consistency between trends in the ANNz

21 We note that the training and validation sets are a random 
sub-sample of 36% of the catalog. 




Fig. 7. — Artificial neural network photo-z scatter as a function
of galaxy color, g - r (top panel), and magnitude, r (bottom panel),
analyzed as described in §3.3 for different combinations of SDSS
and GALEX data. The scatter is compared to the performance
of the Oyaizu et al. (2007) photo-z pipeline (ANNz, SDSS ugriz,
blue) on the same data set. The addition of GALEX data to ugriz
data (ANNzG: ugriz, green) provides the best photo-z predictions,
while GALEX combined with griz (ANNzG: griz, red) and even
just gri (ANNzG: gri, black) only results in slight increases in
scatter compared to the complete data set. This indicates that the
GALEX data provide more redshift information than the SDSS u
or z bands. As a systematic check, we have run an identical ANNz
analysis on the SDSS ugriz data without GALEX (No GALEX
ugriz, magenta), and we find that the scatter distribution is similar
to that recovered by Oyaizu et al. (2007), both as a function of g - r
and r. The galaxy distributions in the scatter bins are those shown
in the bottom panels of Fig. [5] and [6].

pipeline results (ANNz, SDSS ugriz in legend) and our 
own analysis applied to the SDSS only data (No GALEX 
ugriz in legend) is an indication that our ANNzG results 
are robust. The photo-z bias was also explored, and it 
was found to be roughly five to twenty times smaller than 
the scatter in each bin, so we do not discuss it further. 

4.3. Public Photo-z Catalogs 

Here we apply our ML and ANNzG (§3) approaches
for calculating photo-z to stripe 82 galaxies that
do not have spectroscopic data. The redshift distributions
from these analyses as well as the SDSS ANNz
pipeline are compared in Fig. [8]. The top and bottom
panels show the redshift distributions for galaxies with
r < 19 and r < 21, respectively. Because of the lack
of SDSS spectroscopic observations for GALEX-detected
galaxies with r > 19 (Fig. [5]) and z > 0.3 (Fig. [4]), the
current ANNzG analysis does not have accurate training
beyond these limits. The excess number of galaxies at
z ≈ 0.3 in the lower panel of Fig. [8] is due to the resulting
failure of the ANNzG analysis for galaxies with
r > 19. As expected, this clearly indicates that to use
empirical photo-z techniques one must ensure that spectroscopic
training sets are representative of the complete
photometric sample.
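
One simple way to check this in practice is to measure what fraction of the photometric sample falls outside the magnitude range spanned by the training set. The sketch below is a hypothetical diagnostic, not part of the ANNz package; the function name and the toy magnitudes are assumptions made for illustration.

```python
def fraction_outside_training(train_mag, target_mag):
    """Fraction of target galaxies outside the training magnitude range."""
    if not train_mag or not target_mag:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(train_mag), max(train_mag)
    outside = sum(1 for m in target_mag if m < lo or m > hi)
    return outside / len(target_mag)

train_r = [16.5, 17.4, 18.2, 18.9]          # spectroscopic training set
target_r = [17.0, 18.5, 19.4, 20.1, 20.8]   # photometric sample to r < 21
print(fraction_outside_training(train_r, target_r))  # 0.6
```

A range check of this kind is only a necessary condition; a representative training set should also match the photometric sample in color and redshift.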

The ML analysis, on the other hand, does not require 
spectroscopic training, and the lower panel of Fig. [8] 
shows that the ML analysis on galaxies detected by 
GALEX produces a similar distribution to the SDSS 
ANNz pipeline even for dimmer galaxies with 19 < r < 
Fig. 8. — Comparison of photo-z distributions for SDSS data
with GALEX detections. The top panel shows data with r < 19. 
The ML analysis (black), the ANNzG analysis (red), and the SDSS 
ANNz results (blue) are all relatively similar in this magnitude 
regime. The bottom panel compares the same three analyses on 
data with r < 21. The ANNzG analysis clearly fails here, because 
the current training set utilizes SDSS spectroscopic measurements,
which are concentrated at brighter magnitudes (Fig. [5], bottom panel).
The ML analysis distribution has some failures as well, but it has 
significantly less bias than the ANNzG analysis. The excess in the 
lowest z bin has been observed in all ML analyses with GALEX 
data (Fig. [3]). In our public catalogs, the ANNzG and ML results
are provided for all galaxies with r < 19, while ML results are 
provided for all galaxies detected by GALEX with r < 21. 

21. The primary difference between the ML and ANNz 
distributions is the excess in the ML lowest redshift bin, 
which is a known failure of the BPZ code used for this 
analysis (described in §4.1).

The resulting catalogs containing both the ML and 
ANNzG analyses for galaxies observed by GALEX 
will be made publicly available for the community at 
www.ice.csic.es/personal/jimenez/PHOTOZ. The cata- 
logs will be updated with more complete versions as our 
GALEX stripe 82 observations are completed. Since the 
ANNzG analysis has only been trained up to r ≈ 19, we
provide an ANNzG photo-z catalog for all SDSS galax- 
ies up to this limit as well as ML photo-z on those same 
galaxies. Since the ML analysis does not require spectro- 
scopic training, we also provide a catalog with ML photo- 
z estimates for all SDSS galaxies that have GALEX de- 
tections and r < 21. 

5. CONCLUSIONS 

In order to obtain accurate photometric redshifts as
efficiently as possible for the areas surveyed by SZ experiments,
we have obtained moderate-depth GALEX photometry.
With a modest observing campaign, and using
already available MIS observations, we have already covered
an area of ~60 deg^2 to a mean depth of ~3 ks. At
the completion of our ~210 ks of observations, we will
have covered ~100 deg^2 to this depth.

Budavári et al. (2005) previously used ugriz SDSS
DR1 photometry together with GALEX Medium Imaging
Survey (MIS, 1.4 ks exposure) FUV and NUV photometry
to determine photo-z for about 10000 galaxies
up to z ≈ 0.25. They used an empirical technique that
relies on a training set of about 6000 objects, and obtained
photo-z errors of σ_z = 0.026 on the training set,
which is similar to the SDSS ANNz performance. As
large training sets and u-band data may not be available
for the next generation large-area SZ cluster surveys and
as u-band photometry may not be available for future
optical surveys such as BCS, DES, and LSST, we have
considered two cases.

To be independent of training sets, we considered a 
spectral-energy-distribution, or template-based, photo- 
z approach. As we have found that suitable templates 
for use with GALEX observations were not publicly 
available, we have constructed new, physically moti- 
vated, spectral templates. They are publicly available 
at www.ice.csic.es/personal/jimenez/PHOTOZ.

Using the SDSS spectroscopic survey we have shown
that the addition of GALEX photometry to only griz
bands makes possible the use of simple maximum likelihood
techniques, without resorting to Bayesian priors.
This approach obtains σ_z = 0.04(1 + z) for r < 20 galaxies,
which includes luminous galaxies up to z ~ 0.4.
This accuracy approaches that obtained using spectroscopic
training of neural networks on ugriz photometry
of the same galaxy sample. In particular, we have shown
that the large number of catastrophic failures that occur
for griz-based and ugriz-based maximum likelihood
photo-z determinations is nearly eliminated by adding
UV photometry from GALEX data to griz data. The
improvement is especially notable for blue galaxies; for
galaxies with g - r < 0.6, we obtain photo-z scatter of
~0.03(1 + z). We find that the addition of UV observations
to griz photometry provides significantly better
photo-z than the addition of u-band observations.
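
To make the maximum likelihood step concrete, the sketch below fits template fluxes to observed fluxes on a redshift grid by minimizing chi-squared with a free amplitude and no prior. It is a generic illustration rather than the BPZ implementation. The function names and the toy model grid are hypothetical; in a real application the model fluxes would come from integrating redshifted templates through the filter curves.

```python
def best_amplitude(flux, err, model):
    """Amplitude a that minimizes chi^2 = sum(((f - a*m) / s)^2)."""
    num = sum(f * m / s ** 2 for f, s, m in zip(flux, err, model))
    den = sum(m ** 2 / s ** 2 for s, m in zip(err, model))
    return num / den

def chi2(flux, err, model, amp):
    """Chi-squared of the scaled model against the observed fluxes."""
    return sum(((f - amp * m) / s) ** 2 for f, s, m in zip(flux, err, model))

def ml_photoz(flux, err, model_grid):
    """Return (chi2, template, z) of the best fit, with no prior applied.

    model_grid maps (template_name, z) -> list of model fluxes per band.
    """
    best = None
    for (name, z), model in model_grid.items():
        amp = best_amplitude(flux, err, model)
        c2 = chi2(flux, err, model, amp)
        if best is None or c2 < best[0]:
            best = (c2, name, z)
    return best

# Toy grid: two templates, two redshifts, three bands (hypothetical fluxes).
grid = {
    ("red", 0.1): [1.0, 2.0, 3.0],
    ("red", 0.3): [0.5, 2.0, 4.0],
    ("blue", 0.1): [3.0, 2.0, 1.0],
    ("blue", 0.3): [2.0, 2.0, 2.0],
}
observed = [1.1, 4.0, 7.9]   # close to twice the ("red", 0.3) model
errors = [0.1, 0.1, 0.1]
c2, name, z = ml_photoz(observed, errors, grid)
print(name, z, round(c2, 2))
```

Because no prior is applied, the estimate depends only on how well the templates and bands separate different redshifts. Adding UV bands helps in exactly this way, since it extends the wavelength range over which the templates can differ.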

Beyond z ≈ 0.4, the GALEX ~3 ks exposures do not
have a sufficient number of detections to dramatically
improve on the griz observations. Clearly, moderately deeper
observations would help to bring the utility of GALEX
observations closer to z ≈ 1. We note that the current
depth of z ≈ 0.4 looks back through roughly 33% of the
age of the universe and samples a volume of 15 Gpc^3.
Perhaps more importantly, ~20% of the clusters that will
be detected by the SZ experiments (above a dark matter
mass of 3 × 10^14 M_⊙) are at z < 0.4. If redshifts up to
z = 1 were accessible by GALEX, ~60% of the age of
the universe and a volume of 153 Gpc^3 would be surveyed;
86% of the clusters that will be detected by the
SZ experiments (above a dark matter mass of 3 × 10^14
M_⊙) and 90% of the resolved ones are expected to be at
z < 1.
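
The volume and look-back figures can be checked approximately with a short calculation. The sketch below assumes a flat ΛCDM cosmology with Ω_m = 0.3 and H_0 = 70 km/s/Mpc, values chosen here for illustration because the parameters behind the quoted numbers are not stated in this section. It also assumes that the volume is the full-sky comoving volume. Under these assumptions the results come out close to, but not exactly at, the quoted values.

```python
import math

OMEGA_M = 0.3             # assumed matter density
OMEGA_L = 1.0 - OMEGA_M   # flat universe
H0 = 70.0                 # assumed Hubble constant, km/s/Mpc
C_KM_S = 299792.458       # speed of light, km/s

def e_of_z(z):
    """Dimensionless Hubble rate H(z)/H0 for flat LCDM (no radiation)."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)

def comoving_distance_gpc(z, steps=1000):
    """Line-of-sight comoving distance via Simpson's rule (steps must be even)."""
    h = z / steps
    total = 1.0 / e_of_z(0.0) + 1.0 / e_of_z(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) / e_of_z(i * h)
    return (C_KM_S / H0) * total * h / 3.0 / 1000.0

def full_sky_volume_gpc3(z):
    """Comoving volume of the full sky out to redshift z."""
    return 4.0 / 3.0 * math.pi * comoving_distance_gpc(z) ** 3

def age_in_hubble_times(z):
    """Age of the universe at redshift z in units of 1/H0 (closed form)."""
    x = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(OMEGA_L)) * math.asinh(x)

def lookback_fraction(z):
    """Look-back time to redshift z as a fraction of the present age."""
    return 1.0 - age_in_hubble_times(z) / age_in_hubble_times(0.0)

for z in (0.4, 1.0):
    print(f"z={z}: V={full_sky_volume_gpc3(z):.0f} Gpc^3, "
          f"look-back={100 * lookback_fraction(z):.0f}% of the age")
```

Small differences from the quoted numbers are expected, since they depend on the adopted Ω_m and H_0.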

The most important aspect of the results presented
here is that the photo-z accuracy of σ_z = 0.04(1 + z) at
z < 0.4 was obtained using only maximum-likelihood fits
to six galaxy templates in BPZ, without resorting to priors
or training sets. As the acquisition of training sets or
priors relies on obtaining large spectroscopic data sets,
we consider the moderate GALEX exposures an efficient
way to obtain accurate photo-z over large areas. 22 Further,
GALEX photometry gives a direct measurement of

22 Note that we just integrated ~2.4 days and that, for example,
a program 10 times longer could provide photo-z for about 1000
deg^2, which (we estimate) is the optimal area to extract cosmological
information from SZ surveys.
star formation and AGN activity (Atlee & Gould 2007),
a subject that we are continuing to explore.

Should large spectroscopic training sets be available, 
we have considered the effect of adding UV photometry 
to optical data on the performance of an artificial neural 
network photo-z calculation. The addition of GALEX
observations to optical griz (or even just gri) observations
yields photo-z with σ_z = 0.02(1 + z), which
is a ~30% smaller scatter than was obtained on the same
data set using only SDSS ugriz observations.

We make our photo-z catalogs of stripe 82 
galaxies detected by GALEX publicly available at 
www.ice.csic.es/personal/jimenez/PHOTOZ. The cata- 
logs contain the results of the ML photo-z calculation 
for all GALEX detected galaxies with r < 21 as well as 
the ANNzG and ML photo-z calculations for all SDSS 
stripe 82 galaxies in GALEX fields with r < 19. The 
posted catalogs will be updated as our GALEX observa- 
tions and analysis are completed. 

The approach proposed here can provide a useful cat- 
alog for weak-lensing studies as photo-z remain accu- 
rate for the bluest galaxies. These determinations are 
commonly the most difficult to obtain because spec- 
tra of blue galaxies in the optical bands show an al- 
most featureless power law spectral energy distribution. 
We envision our SED-based ML approach to be use- 
ful for cross-correlation studies with CMB maps, where 
deep photo-z are needed over large areas and large 
training sets may not be available. Possible applications
of these studies include improving our understanding
of dark energy using cluster counting techniques
(Carlstrom et al. 2005; Lima & Hu 2007), the kSZ effect
(Hernández-Monteagudo et al. 2006), and the lensing of
the CMB by large-scale structure (Carbone et al. 2007).